On using samples of known protein content to assess the statistical calibration of scores assigned to peptide-spectrum matches in shotgun proteomics.
Authors
Abstract
In shotgun proteomics, the quality of a hypothesized match between an observed spectrum and a peptide sequence is quantified by a score function. Because the score function lies at the heart of any peptide identification pipeline, this function greatly affects the final results of a proteomics assay. Consequently, valid statistical methods for assessing the quality of a given score function are extremely important. Previously, several research groups have used samples of known protein composition to assess the quality of a given score function. We demonstrate that this approach is problematic, because the outcome can depend on factors other than the score function itself. We then propose an alternative use of the same type of data to validate a score function. The central idea of our approach is that database matches that are not explained by any protein in the purified sample comprise a robust representation of incorrect matches. We apply our alternative assessment scheme to several commonly used score functions, and we show that our approach generates a reproducible measure of the calibration of a given peptide identification method. Furthermore, we show how our quality test can be useful in the development of novel score functions.
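The central idea can be illustrated with a minimal sketch. It assumes, beyond what the abstract states, that the score function reports a p-value for each match; under that assumption a well-calibrated score should give roughly uniform p-values on incorrect matches. The function names, the toy peptides, and the choice of a Kolmogorov-Smirnov distance as the summary statistic are all hypothetical illustrations, not the authors' published procedure.

```python
def incorrect_match_p_values(psms, sample_peptides):
    """Keep p-values of matches whose peptide is not explained by any sample protein."""
    return [p for peptide, p in psms if peptide not in sample_peptides]


def ks_distance_to_uniform(p_values):
    """Largest gap between the empirical CDF of p_values and the Uniform(0, 1) CDF."""
    xs = sorted(p_values)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one p-value")
    d = 0.0
    for i, x in enumerate(xs):
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides of the jump.
        d = max(d, abs((i + 1) / n - x), abs(x - i / n))
    return d


# Hypothetical toy data: (peptide, p-value) pairs from a database search,
# plus the set of peptides that the known sample proteins can explain.
sample_peptides = {"LVNELTEFAK", "YLYEIAR"}
psms = [
    ("LVNELTEFAK", 0.0001),  # explained by the sample, so excluded from the null set
    ("YLYEIAR", 0.002),      # explained by the sample, so excluded from the null set
    ("AAAGGK", 0.125),
    ("TTPLLR", 0.375),
    ("QWERTK", 0.625),
    ("MNPSDK", 0.875),
]

null_p = incorrect_match_p_values(psms, sample_peptides)
distance = ks_distance_to_uniform(null_p)
print(null_p, distance)
```

A small distance suggests that the p-values of the unexplained matches are close to uniform, which is what one would expect from a calibrated score under the assumption above. A large distance would point to miscalibration.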
Similar resources
Determining the calibration of confidence estimation procedures for unique peptides in shotgun proteomics.
The analysis of a shotgun proteomics experiment results in a list of peptide-spectrum matches (PSMs) in which each fragmentation spectrum has been matched to a peptide in a database. Subsequently, most protein inference algorithms rank peptides according to the best-scoring PSM for each peptide. However, there is disagreement in the scientific literature on the best method to assess the statist...
The accuracy of statistical confidence estimates in shotgun proteomics
High-throughput techniques are currently some of the most promising methods to study molecular biology, with the potential to improve medicine and enable new biological applications. In proteomics, the large scale study of proteins, the leading method is mass spectrometry. At present researchers can routinely identify and quantify thousands of proteins in a single experiment with the technique ...
On the Importance of Well-Calibrated Scores for Identifying Shotgun Proteomics Spectra
Identifying the peptide responsible for generating an observed fragmentation spectrum requires scoring a collection of candidate peptides and then identifying the peptide that achieves the highest score. However, analysis of a large collection of such spectra requires that the score assigned to one spectrum be well-calibrated with respect to the scores assigned to other spectra. In this work, w...
Effect of Laparoscopic Gastric Plication on the Blood Protein Profile of Obese Subjects Using Shotgun Proteomics
Introduction: Nowadays, bariatric surgery is considered to be the most effective technique in the treatment of morbid obesity. In the current study, the effect of Laparoscopic Gastric Plication (LGP), a new technique, on the serum protein profile of obese patients has been investigated following surgery. Materials and Methods: Serum of 16 obese subjects with mean body mass index (BMI) of 41.2±5...
MSblender: A probabilistic approach for integrating peptide identifications from multiple database search engines.
Shotgun proteomics using mass spectrometry is a powerful method for protein identification but suffers limited sensitivity in complex samples. Integrating peptide identifications from multiple database search engines is a promising strategy to increase the number of peptide identifications and reduce the volume of unassigned tandem mass spectra. Existing methods pool statistical significance sc...
Journal title:
- Journal of Proteome Research
Volume 10, Issue 5
Pages -
Publication date: 2011